Have you checked what "dmesg -w" shows while you start up a VM with your GPU passed to it? I was having similar issues before, and it turned out that I had to get libvirt to append the GPU to the VM after it was done booting because of some invalid rom issues. Could be worth a shot.
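
One way to try the late-attach idea is to keep "dmesg -w" running in one terminal and hot-plug the GPU with "virsh attach-device" once the guest has booted. The sketch below only generates the <hostdev> XML for a PCI address. The helper name hostdev_xml, the domain name win10 and the address 0000:01:00.0 are assumptions for illustration, not details of anyone's setup in this thread.

```shell
# Hypothetical helper: print libvirt <hostdev> XML for a PCI address
# given as DDDD:BB:SS.F (for example 0000:01:00.0).
hostdev_xml() {
    addr="$1"
    domain="${addr%%:*}"      # DDDD
    rest="${addr#*:}"         # BB:SS.F
    bus="${rest%%:*}"         # BB
    slotfn="${rest#*:}"       # SS.F
    slot="${slotfn%.*}"       # SS
    fn="${slotfn#*.}"         # F
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x$domain' bus='0x$bus' slot='0x$slot' function='0x$fn'/>
  </source>
</hostdev>
EOF
}

# Example use (not run here; "win10" is an assumed domain name):
#   dmesg -w                                   # in a second terminal
#   hostdev_xml 0000:01:00.0 > gpu.xml
#   virsh attach-device win10 gpu.xml --live
```

Whether hot-plugging avoids the ROM problem depends on the card and host, so treat this as something to experiment with rather than a known fix.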

On 2016-01-27 11:13, Ryan Flagler wrote:
I pulled some different hardware from an unused machine and did some testing. So far, on my initial test, I am no longer seeing the driver crashes. It must have something to do with my motherboard. That's frustrating, because it was a Xeon E5 platform which I think Alex recommended in general.

Thanks for all the advice everyone.

On Tue, Jan 26, 2016 at 11:56 AM Ryan Flagler <ryan.flag...@gmail.com> wrote:

    Thanks for the encouragement guys. I think I'm going to try to
    scrounge up some other hardware just to make sure my GPU isn't
    the problem. The only other cards I have are AMD, which, apart
    from rebooting, actually work solidly.

    On Tue, Jan 26, 2016 at 11:36 AM Ruben Felgenhauer
    <4felg...@informatik.uni-hamburg.de> wrote:

        Hi, Ryan!

        Installing an older Kernel is probably easier than you might
        think.
        On Ubuntu you should be able to find out which kernels are in
        the repos with apt-cache,
        but I sadly don't know the params, so maybe take a look at the
        manpage.
        And afterwards you should be able to install a specific
        version with 'apt-get install packagename=version'

        On Debian there is simply
        http://snapshot.debian.org/package/linux/ which is how I
        downgraded from 4.3 to 4.1 on my Debian testing.
        You can just download the deb files there and install them
        with dpkg.
        Maybe if you search for a testing system that is similar to
        Ubuntu, you could give that a try.

        But keep in mind that this doesn't uninstall the old kernel,
        so you will have a fallback.
        You might need to select the right kernel at GRUB though.
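
The apt steps described above could look roughly like this. It is a sketch, not a tested recipe: the helper names are made up for illustration and linux-image-generic is only an example package name. Note that on Ubuntu the kernel version is usually part of the package name itself, so installing an older kernel often means picking a differently named package rather than a different version of the same one.

```shell
# Hypothetical helpers; package names below are examples, not from the thread.

# List kernel image packages known to apt (Ubuntu/Debian).
list_kernel_packages() {
    apt-cache search --names-only '^linux-image-[0-9]' | sort
}

# Show the versions apt can install for one package.
list_versions() {
    apt-cache madison "$1"
}

# Build the package=version argument that apt-get install expects.
pin_spec() {
    printf '%s=%s\n' "$1" "$2"
}

# Example use (not run here):
#   list_kernel_packages
#   list_versions linux-image-generic
#   sudo apt-get install "$(pin_spec linux-image-generic <version>)"
#   sudo dpkg -i linux-image-*.deb     # for debs from snapshot.debian.org
```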

        Best regards,
        Ruben


        On 26.01.2016 at 18:11, Will Marler wrote:
        Well, you run Linux and you're experimenting with VGA
        passthrough ... you're resourceful! What about picking up a
        16GB SSD for $15
        <http://www.amazon.com/Samsung-16GB-Solid-State-Drive/dp/B003YMJPE8/ref=sr_1_3?ie=UTF8&qid=1453827934&sr=8-3&keywords=16GB+SSD>
        and installing Arch (or Fedora, or Gentoo... whatever suits) side
        by side with Ubuntu? Presumably your VM can be launched
        either way without any configuration changes ... when you get
        tired/frustrated of the Arch/Fedora/Gentoo way you reboot
        back. If it works, you've found the answer, if it doesn't,
        you've improved your Linux-fu for not much (monetary) cost.


        On Tue, Jan 26, 2016 at 10:03 AM, Ryan Flagler
        <ryan.flag...@gmail.com> wrote:

            Yea, that's just a major jump. Wish I had a dedicated
            test system to try more things. ;)

            On Tue, Jan 26, 2016 at 10:34 AM Will Marler
            <w...@wmarler.com> wrote:

                Next up would be Kernel, it sounds like...

                On Tue, Jan 26, 2016 at 8:27 AM, Ryan Flagler
                <ryan.flag...@gmail.com> wrote:

                    Thanks for this info Will. Tried matching your
                    qemu/libvirt versions and I still get the driver
                    crashes. I'm not sure what else to try.

                    On Mon, Jan 25, 2016 at 9:20 PM Will Marler
                    <w...@wmarler.com> wrote:

                        Hey Ryan,

                        Here are the answers to your questions:

                        20:06:27 will~% uname -a
                        Linux haze 4.3.3-2-ARCH #1 SMP PREEMPT Wed
                        Dec 23 20:09:18 CET 2015 x86_64 GNU/Linux
                        20:07:01 will~% pacman -Q | egrep
                        '^linux|^libvirt|^qemu'
                        libvirt 1.3.1-1
                        libvirt-glib 0.2.2-1
                        libvirt-python 1.3.1-1
                        linux 4.3.3-2
                        linux-api-headers 4.1.4-1
                        linux-firmware 20151207.bbe4917-1
                        qemu 2.4.1-2

                        And here is the pastebin to my XML file:
                        http://pastebin.com/nB3DPkEr

                        As far as the guest drivers are concerned,
                        they're the "GeForce Game Ready Driver"
                        version 361.43.

                        HTH!

                        On Mon, Jan 25, 2016 at 10:12 AM, Ryan
                        Flagler <ryan.flag...@gmail.com> wrote:

                            Thanks Will. Here is my info with the
                            guest that crashes.

                            Host OS Info
                             ubuntu - 14.04.03
                             kernel - 3.19.0-47

                            virsh version
                             Compiled against library: libvirt 1.2.18
                             Using library: libvirt 1.2.18
                             Using API: QEMU 1.2.18
                             Running hypervisor: QEMU 2.5.0

                            patches
                             I did not manually apply any patches to
                            Qemu. Built directly from source.

                            Guest Info
                             Windows 10
                             nVidia Graphics Driver 361.43

                            Guest Event Viewer Entry On Driver Crash
                             Source - nvlddmkm
                             Event ID - 14
                             Info - \Device\Video3  CMDre 00000004
                            0000011c bad0011f 00000000 00d0011f

                            Guest XML - Attached


                            On Mon, Jan 25, 2016 at 10:18 AM Will
                            Marler <w...@wmarler.com> wrote:

                                On Mon, Jan 25, 2016 at 9:07 AM, Ryan
                                Flagler <ryan.flag...@gmail.com> wrote:

                                    Will, could you tell us the
                                    following?

                                    What Linux distribution on host?

                                Arch

                                    What kernel are you using on host?
                                    What libvirt version on host?
                                    What qemu version on host?

                                Will have to check when I'm home from
                                work & the kids are asnooze, but it's
                                whatever's latest (and I'm not using
                                the linux-vfio-lts kernel)

                                    What OS on guest?

                                Windows 10.

                                    What nvidia graphics driver
                                    version on guest?

                                Again, I'll have to check. But the
                                latest or nearly latest.

                                    My machine's GPU driver crashes
                                    constantly and I'm trying to
                                    narrow down why. Thanks!

                                How frustrating : (. I'll also get a
                                pastebin of my XML for you, in case
                                that will help. I've been running
                                "stable" since mid 2015. I use the
                                quotes because some things tripped me
                                up (guest machine can't "sleep," can
                                only power on & power off; when host
                                machine goes to sleep with guest
                                running, on host wake-up the guest is
                                non-responsive and 100% CPU).

                                Will


                                    On Mon, Jan 25, 2016, 10:02
                                    AM Will Marler <w...@wmarler.com>
                                    wrote:

                                        This is discussed in
                                        http://vfio.blogspot.com/2015/05/vfio-gpu-how-to-series-part-4-our-first.html.
                                        You have to do more than
                                        <kvm><hidden state='on'/></kvm>:

                                        "The GeForce card is nearly
                                        as easy, but we first need to
                                        work around some of the
                                        roadblocks Nvidia has put in
                                        place to prevent you from
                                        using the hardware you've
                                        purchased in the way that you
                                        desire (and by my reading
                                        conforms to the EULA for
                                        their software, but IANAL).
                                        For this step we again need
                                        to run virsh edit on the VM.
                                        Within the <features>
                                        section, remove everything
                                        between the <hyperv> tags,
                                        including the tags
                                        themselves. In their place
                                        add the following tags:

                                        <kvm>
                                        <hidden state='on'/>
                                        </kvm>

                                        Additionally, within the
                                        <clock> tag, find the timer
                                        named hypervclock, remove the
                                        line containing this tag
                                        completely. Save and exit the
                                        edit session."

                                        I can confirm it works, I've
                                        been getting a lot of mileage
                                        from my passed-through 750Ti
                                        lately since getting a Steam
                                        Link :-D.
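
The edits quoted above can be sanity-checked against the saved domain XML. The following is a minimal sketch, assuming the XML is piped in from "virsh dumpxml"; the function name check_nvidia_workaround and the domain name win10 are hypothetical.

```shell
# Hypothetical helper: read domain XML on stdin and report whether it
# matches the edits described in the quoted how-to.
check_nvidia_workaround() {
    xml="$(cat)"
    status=0
    # The <kvm><hidden state='on'/></kvm> element must be present.
    if ! printf '%s\n' "$xml" | grep -q "<hidden state='on'/>"; then
        echo "missing: <kvm><hidden state='on'/></kvm>"
        status=1
    fi
    # The whole <hyperv> section must be gone.
    if printf '%s\n' "$xml" | grep -q "<hyperv"; then
        echo "still present: <hyperv> section"
        status=1
    fi
    # The hypervclock timer line must be gone.
    if printf '%s\n' "$xml" | grep -q "hypervclock"; then
        echo "still present: hypervclock timer"
        status=1
    fi
    [ "$status" -eq 0 ] && echo "ok"
    return "$status"
}

# Example use (not run here; "win10" is an assumed domain name):
#   virsh dumpxml win10 | check_nvidia_workaround
```

This only greps for the three strings the how-to mentions. It does not validate the XML or prove that the guest driver will load.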

                                        On Sun, Jan 24, 2016 at 7:32
                                        AM, Ruben Felgenhauer
                                        <4felg...@informatik.uni-hamburg.de>
                                        wrote:

                                            Hi,

                                            finally I had time to look at
                                            this again. I tried out
                                            virt-manager and after a
                                            bit of playing around
                                            with it, it /somewhat/
                                            worked:

                                            The machine is at least
                                            booting. I still have a
                                            standard vga card enabled
                                            in the virt-manager
                                            config window.
                                            After the machine has
                                            booted, I can see that
                                            the device gets
                                            recognized as 750ti.
                                            However, the gpu doesn't
                                            get used, because of
                                            'Code 43'.
                                            Code 43 is a generic
                                            error, so any idea what
                                            it could mean in this case?

                                            Of course I added the
                                            <kvm><hidden
                                            state='on'/></kvm> lines
                                            at the associated position.

                                            Best regards,
                                            Ruben


                                            On 18.01.2016 at 22:27,
                                            Will Marler wrote:
                                            I'm not sure what
                                            correct command-line
                                            syntax is. Have you
                                            tried using libvirt and
                                            VirtManager to handle
                                            your VM rather than
                                            command line, and
                                            modifying the XML rather
                                            than the command line? I
                                            think that's generally
                                            the preferred method
                                            these days (it's
                                            certainly easier from my
                                            point of view, and the
                                            way I got my 750 Ti to
                                            pass through).

                                            On Mon, Jan 18, 2016 at
                                            11:04 AM, Ruben
                                            Felgenhauer
                                            <4felg...@informatik.uni-hamburg.de>
                                            wrote:

                                                Hi, Alex!

                                                Thanks for your reply!
                                                My GPU indeed has a
                                                separate audio
                                                device located at
                                                01:00.1.

                                                However, just adding
                                                -device vfio-pci,host=01:00.1
                                                doesn't seem to do the trick.
                                                Of course the
                                                corresponding device
                                                is already
                                                blacklisted and
                                                bound to vfio.

                                                The Debian Wiki
                                                entry about VGA
                                                passthrough
                                                (https://wiki.debian.org/VGAPassthrough)
                                                mentions QEMU
                                                arguments like
                                                "-device vfio-pci,host=01:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on,romfile=...
                                                -device vfio-pci,host=01:00.1,bus=pcie.0"
                                                which seems to
                                                address GPUs with
                                                audio devices, but
                                                if I try to do
                                                something similar,
                                                the buses 'root' and
                                                'pcie' couldn't be
                                                found. Maybe I
                                                missed something
                                                very important?

                                                On the same article,
                                                it says that the
                                                "HDMI soundcard
                                                [...] needs to be
                                                unbound from its
                                                driver":
                                                # echo '0000:01:00.1' | sudo tee /sys/bus/pci/devices/0000:01:00.1/driver/unbind
                                                I figured the
                                                vfio-bind script
                                                from the Arch Linux
                                                Forum thread
                                                (https://bbs.archlinux.org/viewtopic.php?id=162768)
                                                would do exactly
                                                this thing, so I
                                                didn't explicitly do
                                                so for the audio
                                                device. Is that okay?
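
One way to check what the script actually did is to read the driver symlink in sysfs for each function of the card. This is a minimal sketch: the function name current_driver is made up, and the SYSFS_ROOT variable is an assumption added only so the helper can be pointed at a test directory instead of the real /sys.

```shell
# Hypothetical helper: print the driver currently bound to a PCI device,
# or "none" if no driver is bound.
# SYSFS_ROOT is an assumed override for testing; it defaults to /sys.
current_driver() {
    dev="$1"
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$dev/driver"
    if [ -L "$link" ]; then
        # The symlink points at the driver directory; its last
        # path component is the driver name.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Example use (not run here):
#   current_driver 0000:01:00.0
#   current_driver 0000:01:00.1
```

If both functions report vfio-pci, the audio device was rebound along with the GPU and no separate unbind step should be needed.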

                                                Best regards,
                                                Ruben


                                                On 18.01.2016 at 08:31,
                                                Alexander Petrenz wrote:
                                                Hi Ruben,

                                                I guess your 750ti
                                                also has some audio
                                                device. You should
                                                pass through this
                                                too. It should be
                                                something like
                                                01:00.1. There are
                                                many command line
                                                examples you can
                                                find about that.
                                                Also, I'm not quite
                                                sure whether you should
                                                remove the x-vga=on.

                                                Regards
                                                Alex

                                                On Sun, Jan 17,
                                                2016 at 11:12 PM,
                                                Ruben Felgenhauer
                                                <4felg...@informatik.uni-hamburg.de>
                                                wrote:

                                                    Hi,

                                                    I am trying to
                                                    pass my nVidia
                                                    GTX 750ti to my
                                                    QEMU guest.

                                                    Problem is:
                                                    After the QEMU
                                                    monitor pops
                                                    up, nothing
                                                    happens. The
                                                    GPU's output is
                                                    dead, and the
                                                    vm won't be
                                                    accessible via
                                                    SSH anymore, so
                                                    it's very
                                                    likely that the
                                                    VM isn't
                                                    booting up at
                                                    all. Also,
                                                    there are no
                                                    error messages
                                                    from QEMU on
                                                    the console
                                                    whatsoever
                                                    which makes
                                                    debugging it
                                                    especially hard.

                                                    This is how I
                                                    start the vm
                                                    with normal vga
                                                    emulation:
                                                    qemu-system-x86_64 -hda vm.ovl -boot c -enable-kvm -m 1024 \
                                                        -cpu host,kvm=off -smp cores=4,threads=2 -redir tcp:5022::22
                                                    Everything runs
                                                    fine in this
                                                    case. To do the
                                                    passthrough, I
                                                    add this:
                                                    -device vfio-pci,host=01:00.0,multifunction=on,x-vga=on -vga none
                                                    This brings
                                                    said problems
                                                    with it. I also
                                                    tried out
                                                    multiple
                                                    different
                                                    combinations of
                                                    -device's
                                                    arguments or
                                                    even adding a
                                                    romfile for the
                                                    GPU, but none
                                                    of these steps
                                                    changed
                                                    anything at all.

                                                    Obviously, I am
                                                    using a BIOS
                                                    installation
                                                    and I'm well
                                                    aware of this bug:
                                                    https://bugzilla.kernel.org/show_bug.cgi?id=107561,
                                                    but neither
                                                    using less RAM
                                                    (as you can see
                                                    I am using 1GB
                                                    now) nor
                                                    switching to an
                                                    older Kernel
                                                    changed
                                                    anything about
                                                    the problem. I
                                                    have tried
                                                    Kernel 4.1.0
                                                    and 4.3.0.

                                                    Host is Debian
                                                    testing with
                                                    QEMU 2.5.0.
                                                    I tried both
                                                    Debian and
                                                    Windows 7 as a
                                                    guest, but both
                                                    are showing
                                                    exactly the
                                                    same behaviour.
                                                    Mainboard is an
                                                    ASUS Z87-PLUS.
                                                    The 750ti is
                                                    produced by
                                                    ASUS as well.

                                                    Any idea how I
                                                    could get
                                                    passthrough
                                                    running?

                                                    
                                                    _______________________________________________
                                                    vfio-users
                                                    mailing list
                                                    vfio-users@redhat.com
                                                    https://www.redhat.com/mailman/listinfo/vfio-users




                                                