The vfio code has logic that checks whether an FLR is possible and attempts it before and after device assignment. Replacing the FLR with a bus reset gets past the stuck option ROM loading phase, and we are able to boot into the guest successfully. This means that the first initialization (by the hardware) changes something in the NVRAM that needs to be reset back to default by a hard (bus) reset.
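For reference, whether a PCIe function advertises FLR can be read from the DevCap lines that lspci -vv prints (FLReset+ or FLReset-). A minimal sketch, assuming pciutils is installed; the helper name has_flr is hypothetical and is not part of vfio or pciutils. It only reports what the device advertises in its PCIe capability and does not reproduce the check vfio itself performs:

```shell
#!/bin/bash
# has_flr: hypothetical helper, not part of vfio or pciutils.
# Reads `lspci -vv` output on stdin and returns 0 if the PCIe
# Device Capabilities advertise Function Level Reset (FLReset+),
# non-zero otherwise.
has_flr() {
    grep -q 'FLReset+'
}

# Example (run as root so lspci can read the capability list):
#   lspci -vv -s 03:00.0 | has_flr && echo "FLR advertised"
```

A device can advertise FLR and still misbehave when the reset is issued, which is the situation described here, so this is only a first check.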
We could add an ugly hack to vfio to do a bus reset for this specific card, but it should be noted that FLR, if supported, should be able to take care of this condition. Note that it is really the FLR that messes up the config space if it is attempted after the sequence of events leading up to the hang. It is easy to reproduce this using setpci writes to the card followed by an FLR, in the following manner:

    #!/bin/bash
    setpci -v -s 03:00.0 4.w=2
    setpci -v -s 03:00.0 4.w
    setpci -v -s 03:00.0 4.w=103
    setpci -v -s 03:00.0 4.w
    setpci -v -s 03:00.0 78.l=1
    setpci -v -s 03:00.0 78.l
    setpci -v -s 03:00.0 80.l=9430
    setpci -v -s 03:00.0 80.l
    setpci -v -s 03:00.0 78.l=a30c
    setpci -v -s 03:00.0 78.l
    setpci -v -s 03:00.0 80.l=7fffffff
    setpci -v -s 03:00.0 80.l
    setpci -v -s 03:00.0 78.l=a5dc
    setpci -v -s 03:00.0 78.l
    setpci -v -s 03:00.0 80.l=0
    setpci -v -s 03:00.0 80.l
    setpci -v -s 03:00.0 78.l=a2ec
    setpci -v -s 03:00.0 78.l
    setpci -v -s 03:00.0 80.l=3
    setpci -v -s 03:00.0 80.l
    setpci -v -s 03:00.0 78.l=a408
    setpci -v -s 03:00.0 78.l
    setpci -v -s 03:00.0 78.l=86420
    setpci -v -s 03:00.0 78.l
    setpci -v -s 03:00.0 80.l=4
    setpci -v -s 03:00.0 80.l
    echo 1 > reset #flr then completely corrupts the config space

-- 
https://bugs.launchpad.net/bugs/1284874

Title:
  Guest hangs during option rom loading with certain cards

Status in QEMU:
  New

Bug description:
  With a Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet card, device assignment does not work. The guest hangs during option ROM execution. Moreover, if an attempt is made to quit qemu while the guest is in the hung state, the card gets into an inoperable state. Only a power cycle then restores the card to working order; just unloading and reloading the driver does not help.
  Qemu version - 1.6.2 or current master
  Distribution - FC19
  Kernel Version - 3.12.9-201.fc19.x86_64

  Details of the card -

  # ethtool -i p2p2
  driver: bnx2x
  version: 1.78.17-0
  firmware-version: bc 7.8.22
  bus-info: 0000:08:00.1
  supports-statistics: yes
  supports-test: yes
  supports-eeprom-access: yes
  supports-register-dump: yes
  supports-priv-flags: yes

  The output of lspci when the card is broken -

  03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet (rev ff) (prog-if ff)
          !!! Unknown header type 7f
          Kernel driver in use: bnx2x
  00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
  10: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
  20: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
  30: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

  I will post if I get a chance to try out a firmware newer than 7.8.22 for the option ROM and see if this issue is fixed. However, it appears we need a unified approach to automatically avoid loading the ROM based on certain criteria. Manually looking out for firmware fixes and hard-coding decisions based on them is neither desirable nor easy to maintain.